CH 105 - Chemistry and Society

Biochemical Structures

08/24/2010

Carbohydrates

Monosaccharide Structures

Sugars can be defined as polyhydroxy aldehydes or ketones. Hence the simplest sugars contain at least three carbons. The most common are the aldo- and keto-trioses, tetroses, pentoses, and hexoses. The simplest 3C sugars are glyceraldehyde and dihydroxyacetone.

Glucose, an aldo-hexose, is a central sugar in metabolism. It and other 5 and 6C sugars can cyclize through intramolecular nucleophilic attack of one of the OH's on the carbonyl C of the aldehyde or ketone. Such intramolecular reactions occur if stable 5 or 6 member rings can form. The resulting rings are labeled furanose (5 member) or pyranose (6 member) based on their similarity to furan and pyran. On nucleophilic attack to form the ring, the carbonyl O becomes an OH which points either below the ring (α anomer) or above the ring (β anomer).

Monosaccharides in solution exist as equilibrium mixtures of the straight-chain and cyclic forms. In solution, glucose (Glc) is mostly in the pyranose form, fructose is 67% pyranose and 33% furanose, and ribose is 75% furanose and 25% pyranose. However, in polysaccharides, Glc is exclusively pyranose and fructose and ribose are furanoses.

Here are the simple 6C sugars you should know:

Jmol: glucose, galactose, and fructose

Many chemical derivatives of sugars are found in the body, as shown below:

Disaccharides and Polysaccharides

Polymers of simple sugars can be made. The chemistry is based on addition of a nucleophile to the aldehyde C in aldoses, as shown below. This first example shows water adding to an aldehyde, which undergoes nucleophilic addition to form a hydrate. If an alcohol (ROH) is used as the nucleophile, the product is called a hemiacetal. The hemiacetal in acidic conditions (+H⁺) can dehydrate and react with a second alcohol to form an acetal. When an aldose cyclizes, it forms a hemiacetal. If an alcohol functional group (OH) on another monosaccharide reacts with the hemiacetal, the resulting acetal is a disaccharide, a sugar polymer with two monosaccharides joined. In the reverse process, water can attack an acetal or disaccharide in a hydrolysis reaction to form two separate monomeric sugars.

This same reaction (acetal formation) can be used to form longer sugar polymers, or polysaccharides. Some common polysaccharides of glucose are shown below:

Chime Molecular Modeling: Amylose and Amylopectin

Jmol: Glycogen (Jmol) | Amylose (Jmol) | Amylopectin (Jmol)

Lipids

We've already encountered lipids when we discussed stearic acid and phospholipids in the section on Intermolecular Forces. What are lipids?

Lipids can be considered to be biological molecules which are soluble in organic solvents, such as chloroform/methanol, and are sparingly soluble in aqueous solutions. There are two major classes, saponifiable and nonsaponifiable, based on their reactivity with strong bases. The nonsaponifiable classes include the "fat-soluble" vitamins (A, E) and cholesterol.

Saponification is the process that produces soaps from the reaction of lipids and a strong base. The saponifiable lipids contain long chain carboxylic acids, or fatty acids, esterified to a “backbone” molecule, which is either glycerol or sphingosine.

Note on nomenclature: Lipids are often distinguished from another commonly used word, fats. Some define fats as lipids that contain fatty acids esterified to glycerol. I will use lipid and fat synonymously.

The major saponifiable lipids are triacylglycerides, glycerophospholipids, and the sphingolipids. The first two use glycerol as the backbone. Triacylglycerides have three fatty acids esterified to the three OHs on glycerol. Glycerophospholipids have two fatty acids esterified at carbons 1 and 2, and a phospho-X group esterified at C3. Sphingosine, the backbone for sphingolipids, has a long alkyl group connected at C1 and a free amine at C2. In sphingolipids, a fatty acid is attached through an amide link at C2, and an H or esterified phospho-X group is found at C3. A general diagram showing the difference in these structures is shown below.

The actual chemical structures of these lipids are shown below.

di-18:0 PC: Jmol | Triacylglyceride: Jmol | Other Lipid Models

The fatty acids that are connected to the glycerol backbone in glycerophospholipids and in triglycerides are really long chain carboxylic acids. Naturally occurring fatty acids usually contain an even number of C atoms and may contain no (saturated fatty acids), 1 (monounsaturated), or more (polyunsaturated) double bonds in the long C chain. Common fatty acids are shown below.

COMMON BIOLOGICAL SATURATED FATTY ACIDS

Symbol	Common name	Systematic name	Structure	mp (°C)
12:0	Lauric acid	Dodecanoic acid	CH₃(CH₂)₁₀COOH	44.2
14:0	Myristic acid	Tetradecanoic acid	CH₃(CH₂)₁₂COOH	52
16:0	Palmitic acid	Hexadecanoic acid	CH₃(CH₂)₁₄COOH	63.1
18:0	Stearic acid	Octadecanoic acid	CH₃(CH₂)₁₆COOH	69.6
20:0	Arachidic acid	Eicosanoic acid	CH₃(CH₂)₁₈COOH	75.4

COMMON BIOLOGICAL UNSATURATED FATTY ACIDS

Symbol	Common name	Systematic name	Structure	mp (°C)
16:1^Δ9	Palmitoleic acid	9-Hexadecenoic acid	CH₃(CH₂)₅CH=CH(CH₂)₇COOH	-0.5
18:1^Δ9	Oleic acid	9-Octadecenoic acid	CH₃(CH₂)₇CH=CH(CH₂)₇COOH	13.4
18:2^Δ9,12	Linoleic acid	9,12-Octadecadienoic acid	CH₃(CH₂)₄(CH=CHCH₂)₂(CH₂)₆COOH	-9
18:3^Δ9,12,15	α-Linolenic acid	9,12,15-Octadecatrienoic acid	CH₃CH₂(CH=CHCH₂)₃(CH₂)₆COOH	-17
20:4^Δ5,8,11,14	Arachidonic acid	5,8,11,14-Eicosatetraenoic acid	CH₃(CH₂)₄(CH=CHCH₂)₄(CH₂)₂COOH	-49
20:5^Δ5,8,11,14,17	EPA	5,8,11,14,17-Eicosapentaenoic acid	CH₃CH₂(CH=CHCH₂)₅(CH₂)₂COOH	-54
22:6^Δ4,7,10,13,16,19	DHA	Docosahexaenoic acid	22:6ω3	

% FATTY ACIDS IN VARIOUS FATS

FAT	<16:0	16:1	18:0	18:1	18:2	18:3	20:0	22:1	22:2	.
Coconut	87	.	3	7	2	.	.	.	.	.
Canola	3	.	.	11	13	10	.	7	50	2
Olive Oil	11	.	4	71	11	1	.	.	.	.
Butterfat	50	4	12	26	4	1	2	.	.	.

Fatty acids can be named in many ways.

symbolic name: given as x:y ^{Δ a,b,c}, where x is the number of C's in the chain, y is the number of double bonds, and a, b, and c are the positions of the start of the double bonds counting from C1 - the carboxyl C. Saturated fatty acids contain no C-C double bonds. Monounsaturated fatty acids contain 1 C=C while polyunsaturated fatty acids contain more than 1 C=C. Double bonds are usually cis.
systematic name: uses IUPAC nomenclature. The systematic name gives the number of Cs (e.g. hexadecanoic acid for 16:0). If the fatty acid is unsaturated, the base name reflects the number of double bonds (e.g. octadecenoic acid for 18:1 ^{Δ 9} and octadecatrienoic acid for 18:3 ^{Δ 9,12,15}).
common name: (e.g. oleic acid, which is found in high concentration in olive oil)
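
The symbolic name is regular enough to be read mechanically. The following is a minimal Python sketch, not part of the original notes, that splits a shorthand such as 18:2^Δ9,12 into x, y, and the double-bond positions, and then applies the saturated / monounsaturated / polyunsaturated definitions given above. The names FattyAcid, parse_symbol, and saturation_class are hypothetical, and the accepted plain-text spellings (Δ or D, optional ^ and braces) are an assumption made for illustration.

```python
import re
from typing import NamedTuple, Tuple


class FattyAcid(NamedTuple):
    carbons: int                # x: number of C's in the chain
    double_bonds: int           # y: number of C=C double bonds
    positions: Tuple[int, ...]  # a, b, c: double-bond starts, counted from C1


# Assumed plain-text spellings of the shorthand: "18:2^Δ9,12" or "18:2 D9,12".
_PATTERN = re.compile(r"\s*(\d+):(\d+)\s*(?:\^?\s*[ΔD]\s*([\d,\s]+))?")


def parse_symbol(symbol: str) -> FattyAcid:
    """Split an x:y^Δa,b,c shorthand into its three parts."""
    cleaned = symbol.replace("{", "").replace("}", "")
    match = _PATTERN.fullmatch(cleaned)
    if match is None:
        raise ValueError(f"not a recognised shorthand: {symbol!r}")
    carbons, double_bonds = int(match.group(1)), int(match.group(2))
    listed = match.group(3)
    positions = tuple(int(p) for p in listed.split(",")) if listed else ()
    # When positions are listed, there must be one per double bond.
    if listed and len(positions) != double_bonds:
        raise ValueError("number of positions must equal number of double bonds")
    return FattyAcid(carbons, double_bonds, positions)


def saturation_class(fatty_acid: FattyAcid) -> str:
    """Classify by the number of C=C double bonds."""
    if fatty_acid.double_bonds == 0:
        return "saturated"
    if fatty_acid.double_bonds == 1:
        return "monounsaturated"
    return "polyunsaturated"


for name in ("16:0", "18:1^Δ9", "18:3 ^{Δ 9,12,15}"):
    acid = parse_symbol(name)
    print(name, "->", acid.carbons, "C,", saturation_class(acid))
```

For example, 16:0 (palmitic acid) is reported as saturated, and 18:3 ^{Δ 9,12,15} (α-linolenic acid) as polyunsaturated.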

You should know the common name, systematic name, and symbolic representations for these saturated fatty acids:

lauric acid, dodecanoic acid, 12:0
palmitic acid, hexadecanoic acid, 16:0
stearic acid, octadecanoic acid, 18:0

Learn the following unsaturated fatty acids:

oleic acid, octadecenoic acid, 18:1 ^{Δ 9}
linoleic acid, octadecadienoic acid, 18:2 ^{Δ 9,12}
α-linolenic acid, octadecatrienoic acid, 18:3 ^{Δ 9,12,15} (n-3)
arachidonic acid, eicosatetraenoic acid, 20:4 ^{Δ 5,8,11,14} (n-6)
eicosapentaenoic acid (EPA), 20:5 ^{Δ 5,8,11,14,17} (n-3) Note: sometimes written as eicosapentenoic
docosahexaenoic acid (DHA), 22:6 ^{Δ 4,7,10,13,16,19} (n-3) Note: sometimes written as docosahexenoic

There is an alternative to the symbolic representation of fatty acids, in which the Cs are numbered from the distal end (the n or ω end) of the acyl chain (the opposite end from the carboxyl group). Hence 18:3 ^{Δ 9,12,15} could be written as 18:3 (ω-3) or 18:3 (n-3), where the terminal C is numbered one and the first double bond starts at C3. Arachidonic acid is an (ω-6) fatty acid while docosahexaenoic acid is an (ω-3) fatty acid.
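
The conversion between the two numbering schemes is a single subtraction: the double bond closest to the ω end is the one with the largest Δ position, and counting back from the terminal C gives n = (number of Cs) - (largest Δ position). The short Python sketch below illustrates this; the function name omega_class is hypothetical and not part of the original notes.

```python
from typing import Sequence


def omega_class(carbons: int, delta_positions: Sequence[int]) -> int:
    """Return n for the (n-x) / (omega-x) class of an unsaturated fatty acid.

    carbons: total number of Cs in the chain.
    delta_positions: double-bond start positions counted from the carboxyl C.
    """
    if not delta_positions:
        raise ValueError("a saturated fatty acid has no omega class")
    # The double bond nearest the methyl (omega) end has the largest delta position.
    return carbons - max(delta_positions)


examples = {
    "alpha-linolenic acid": (18, [9, 12, 15]),
    "arachidonic acid": (20, [5, 8, 11, 14]),
    "docosahexaenoic acid": (22, [4, 7, 10, 13, 16, 19]),
}
for name, (carbons, positions) in examples.items():
    print(f"{name}: omega-{omega_class(carbons, positions)}")
```

This reproduces the assignments in the text: 18 - 15 = 3 for α-linolenic acid, 20 - 14 = 6 for arachidonic acid, and 22 - 19 = 3 for docosahexaenoic acid.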

Note that nearly all naturally occurring double bonds are cis, with a methylene spacer between double bonds - i.e. the double bonds are not conjugated. For saturated fatty acids, the melting point increases with C chain length, owing to increased van der Waals (London or induced dipole) interactions between the overlapping and packed chains. Within chains of the same number of Cs, melting point decreases with increasing number of double bonds, owing to the kinking of the acyl chains, which decreases packing and reduces intermolecular forces (IMFs). Fatty acid composition differs in different organisms:

animals have 5-7% of fatty acids with 20-22 carbons, while fish have 25-30%
animals have <1% of their fatty acids with 5-6 double bonds, while plants have 5-6% and fish 15-30%

Many studies support the claim that diets high in fish that contain abundant n-3 fatty acids, in particular EPA and DHA, reduce inflammation and cardiovascular disease. n-3 fatty acids are abundant in high oil fish (salmon, tuna, sardines), and lower in cod, flounder, snapper, shark, and tilapia.

The most common polyunsaturated fats (PUFAs) in our diet are the n-3 and n-6 classes. The most abundant n-6 fatty acid in plant foods is linoleic acid (18:2n-6, or 18:2^Δ9,12), while linolenic acid (18:3n-3 or 18:3^Δ9,12,15) is the most abundant in the n-3 class. These fatty acids are essential in that they are biological precursors for other PUFAs. Specifically,

linoleic acid (18:2n-6, or 18:2^Δ9,12) is a biosynthetic precursor of arachidonic acid (20:4n-6 or 20:4^Δ5,8,11,14)
linolenic acid (18:3n-3, or 18:3^Δ9,12,15) is a biosynthetic precursor of eicosapentaenoic acid (EPA, 20:5n-3 or 20:5^{Δ5,8,11,14,17}) and to a much smaller extent, docosahexaenoic acid (DHA, 22:6n-3 or 22:6^{Δ4,7,10,13,16,19}).

Micelles and Bilayers:

We have already discussed in the section on Intermolecular Forces how fatty acids can self-associate to form micelles, while phospholipids associate to form bilayers.

Micelle: Jmol Bilayer: Jmol

Proteins

Proteins are polymers consisting of monomers called amino acids. The monomer contains a carbon to which 4 groups are attached:

a carboxylic acid
an amine
a hydrogen atom
one of twenty "side chains" or "R groups" that differentiate the amino acids from each other.

The monomer in a protein is called an amino acid, a completely different kind of molecule than a nucleotide. There are twenty different naturally occurring amino acids that differ in one of the 4 groups connected to the central carbon. In an amino acid, the central (alpha) carbon has an amine group (RNH₂), a carboxylic acid group (RCOOH) (both groups you studied last week), an H, and an R group attached to it.

20 Amino Acids - Structures
20 Naturally Occurring Amino Acids - Molecular Models: Notice the common blue and red groups in all amino acids. Notice the different "R" groups pointing down in each figure.

Amino Acids: Structures

Amino acids form polymers through an attack by the amino group of one amino acid at the carbonyl carbon of the carboxyl group of another amino acid. The carboxyl group of the amino acid must first be activated by ATP to provide a better leaving group than OH⁻. The resulting link between the amino acids is an amide link, which biochemists call a peptide bond; it is the same as the amide bond we studied before. In this reaction, water is released. In the reverse reaction, the peptide bond can be cleaved by water (hydrolysis).

We saw the reverse reaction, the cleavage of an amide by water, when we studied the chemistry of carbonyl groups. Here is a diagram that will help you review that process and understand the making of a peptide bond.

carbonyl chemistry and the peptide bond

When two amino acids link together to form an amide link, the resulting structure is called a dipeptide. Likewise, we can have tripeptides, tetrapeptides, and other polypeptides. At some point, when the structure is long enough, it is called a protein. There are many different ways to represent the structure of a polypeptide or protein, each showing differing amounts of information.

Both the sequence of a protein and its total length differentiate one protein from another. Just for an octapeptide, there are over 25 billion different possible arrangements of amino acids. Compare this to just 65,536 different oligonucleotides of 8 monomeric units (8mer). Hence the diversity of possible proteins is enormous.
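
Both counts come from the same rule: with k possible monomers at each of n positions, there are k^n possible sequences. The short Python sketch below, added here as an illustration, checks the two numbers quoted above; the function name sequence_count is hypothetical.

```python
def sequence_count(monomer_types: int, length: int) -> int:
    """Number of distinct linear sequences of a given length."""
    return monomer_types ** length


octapeptides = sequence_count(20, 8)  # 20 amino acids, 8 positions
octamers = sequence_count(4, 8)       # 4 bases (A, G, C, T), 8 positions

print(f"octapeptides:       {octapeptides:,}")
print(f"8mer nucleotides:   {octamers:,}")
print(f"ratio:              {octapeptides // octamers:,}")
```

The octapeptide count is 20^8 = 25,600,000,000 and the 8mer count is 4^8 = 65,536, so an octapeptide has 5^8 = 390,625 times as many possible sequences as an oligonucleotide of the same length.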

The actual linear sequence of a protein is called its primary (1°) structure.

The protein chain can form regular repetitive secondary structures called alpha helices and beta sheets through the formation of H bonds between the backbone amide H (δ+) of one amino acid and the backbone carbonyl O (δ-) of another amino acid in the protein. These H bonds are all among main chain atoms in the backbone, not among side chains.

The protein ultimately forms a unique 3D shape, which usually contains some alpha helices and beta sheets. This 3D structure is called the tertiary structure of the protein.

The links below will give you greater insight into protein structure:

Alpha Helices | Jmol - Observe the intrastrand H bonds holding the helix together. Click on the sequential commands in the left window to view the molecule with different renderings.
Twisted Beta Sheets | Jmol - Observe the interstrand H bonds holding the structure together.
Beta Barrel | Jmol - Observe the interstrand H bonds holding the structure together.
Myoglobin | Jmol - an oxygen binding protein - Observe the predominantly alpha-helical nature of the protein.
Superoxide Dismutase | Jmol - a protein catalyst (enzyme) that detoxifies toxic oxygen byproducts in the body; high levels of the protein have been associated with longer life spans. Observe the predominantly beta sheet structure of the protein.
Triose Phosphate Isomerase | Jmol - an enzyme involved in sugar metabolism

WebCT Quiz: Select Protein Structure as the Quiz

Human Protein Reference Database

The structure of proteins is much more complicated than that of micelles and bilayers. To a first approximation, a protein consists of a polar main chain/backbone from which amino acid side chains of varying natures hang. These side chains are polar, polar charged, or nonpolar. In general the nonpolar side chains prefer to be buried in the center of the protein, surrounded by other nonpolar side chains and away from polar water. Given the complexity of the structure, however, not all nonpolar side groups can be buried, and some are solvent exposed. Likewise, polar and charged polar side chains like to be on the surface exposed to water, but some will find themselves buried. If they are, they will be surrounded by polar side chains that stabilize the buried group.

A protein with a buried nonpolar amino acid. What's around it? | Jmol

DNA

DNA is a polymer consisting of monomers called nucleotides. Each monomer contains a simple sugar (deoxyribose), a phosphate group, and a cyclic organic group that is a base (not an acid). Only four bases are used in DNA, which we will abbreviate, for simplicity, as A, G, C, and T. They are bases since they contain amine groups that can accept protons. The polymer consists of a sugar - phosphate - sugar - phosphate backbone, with one base attached to each sugar molecule. DNA can exist as a single-stranded (ss) structure (with one sugar-phosphate backbone), a double-stranded (ds) structure (with two sugar-phosphate backbones which bind to each other through their bases), or mixed forms. It is actually a misnomer to call dsDNA a molecule, since it really consists of two different, complementary strands held together by intermolecular forces called hydrogen bonds. The G base on one strand can form 3 H bonds with a C base on the other strand (this is called a GC base pair). The T base on one strand can form 2 H bonds with an A base on the other strand (this is called an AT base pair). These H bonds are like the "velcro" attractions that would bind two objects with opposite types of velcro to each other. dsDNA varies in length (number of sugar-phosphate units connected), base composition (how many of each set of bases), and sequence (the order of the bases in the backbone).
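
The pairing rules above (G with C through 3 H bonds, A with T through 2 H bonds) can be sketched in a few lines of Python. This is an illustration only, not part of the original notes: the names complement and hydrogen_bonds and the example sequence are hypothetical, and the sketch pairs bases position by position without modelling the direction of the two strands.

```python
# Base-pairing partners and H bonds per pair, as stated in the text.
PARTNER = {"A": "T", "T": "A", "G": "C", "C": "G"}
H_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}


def complement(strand: str) -> str:
    """Return the base that pairs with each base of the given strand."""
    return "".join(PARTNER[base] for base in strand.upper())


def hydrogen_bonds(strand: str) -> int:
    """Total H bonds holding this strand to its complementary strand."""
    return sum(H_BONDS[base] for base in strand.upper())


strand = "GATTACA"  # arbitrary example sequence
print(strand)
print(complement(strand))
print("H bonds:", hydrogen_bonds(strand))
```

For GATTACA the partner bases are CTAATGT, and the two GC pairs and five AT pairs give 2 x 3 + 5 x 2 = 16 H bonds. A strand richer in G and C is therefore held to its partner by more H bonds than an AT-rich strand of the same length.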

DNA: Jmol Tutorial

Structure of a chromosome

Most people have seen pictures of chromosomes viewed through microscopes. Check out this amazing picture of a chromosome taken from Scientific American, September, 1995.

Each chromosome consists of one dsDNA molecule. Each somatic (body) cell of your body has 23 pairs of chromosomes, one member of each pair contributed by your mother and the other by your father. (In germ cells - eggs and sperm - there are 23 individual chromosomes, not chromosome pairs.) One pair are the sex chromosomes, which can come in two forms, X and Y. A pair of X's gives a female, and an XY results in a male.

Human Chromosomes (with an extra copy of Chromosome 21, which causes Down syndrome)

The human genome has about 3 billion base pairs of DNA. Therefore, on average, each single chromosome of a pair has about 130 million base pairs (3 billion divided among 23 chromosomes), and consists of one molecule of DNA with lots of proteins bound to it. dsDNA is a highly charged molecule, and can be viewed, to a first approximation, as a long rod-like molecule with a large negative charge. This very large molecule must somehow be packed into a small nucleus. The packing problem is solved by coiling DNA and packing it with proteins, which usually have a net positive charge. The chromosomes are usually dispersed within the nucleus and are not visible with an ordinary microscope. When the cell is ready to divide, the DNA in the chromosomes replicates, and the chromosomes condense so that they are visible (when stained) using an ordinary microscope. At this point the chromosomes can be stained with a variety of stains (hence the name chromosomes), some of which bind differentially to different chromosomes. The different chromosomes can hence be distinguished by their size, shape, and dye-binding properties.

Human Chromosomes

The standard picture of a chromosome with which you are familiar, including the one shown above, is actually one chromosome of a pair that has just replicated! One of the chromosomes will stay with the mother cell, and the other will go to the daughter cell. These two chromosomes, which are aligned and appear joined at their centers, are called sister chromatids. These large DNA/protein complexes must be further packaged in the nucleus, as shown in the "Carl Saganesque" reducing view of the chromosome: a double-stranded DNA molecule winds around a core of proteins.

Fun DNA Facts to Know and Tell

E. coli Genome: 4.6 x 10⁶ BP (4.6 million BP)
Yeast Genome: 16 x 10⁶ BP
Smallest human chromosome (Y) 50 x 10⁶ BP
Worm: 100 x 10⁶ BP
Fruit Fly: 160 x 10⁶ BP
Largest human chromosome (1) 250 x 10⁶ BP
Entire human genome 3 x 10⁹ (3 billion) BP
Mouse Genome: 3 x 10⁹ BP
Length of uncoiled dsDNA in a human cell: approx 2 meters
Number of human cells: about 100 trillion
Number of times DNA from all human cells, if stretched out, could reach to the sun and back: about 700
If compiled in books, the data would fill an estimated 200 volumes the size of a Manhattan telephone book (at 1000 pages each), and reading it would require 26 years working around the clock. The fruit fly genome would be 10 books, yeast 1 book, E. coli 300 pages, and yeast chromosome 3 would be 14 pages.
Any two individuals differ in about three million - 3 x 10⁶ bases (0.1%). The population is now about 6 x 10⁹ (6 billion). A catalog of all sequence differences would require 18 x 10¹⁵ entries. This catalog may be needed to find the rarest or most complex disease genes.
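
Several of these facts can be checked with simple arithmetic. The Python sketch below, added as an illustration, uses only the figures listed above plus one outside value that is an assumption here: a mean Earth-sun distance of about 1.496 x 10¹¹ meters. All variable names are hypothetical.

```python
# Figures taken from the list above.
DNA_PER_CELL_M = 2.0       # length of uncoiled dsDNA in one human cell, meters
HUMAN_CELLS = 100e12       # about 100 trillion cells
GENOME_BP = 3e9            # entire human genome, base pairs
DIFFERING_BASES = 3e6      # bases that differ between two individuals

# Assumption (not from the notes): mean Earth-sun distance in meters.
EARTH_SUN_M = 1.496e11

total_dna_m = DNA_PER_CELL_M * HUMAN_CELLS
round_trips = total_dna_m / (2 * EARTH_SUN_M)
fraction_differing = DIFFERING_BASES / GENOME_BP

print(f"total DNA length: {total_dna_m:.1e} m")
print(f"round trips to the sun: {round_trips:.0f}")
print(f"fraction of bases differing: {fraction_differing:.1%}")
```

With these inputs the total length is 2 x 10¹⁴ meters, which corresponds to between 600 and 700 round trips to the sun, consistent with the rounded "about 700" in the list, and the fraction of differing bases is 0.1%.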